Abstract

In this work, the joint precoding across two distant transmitters (TXs), sharing the knowledge of the data symbols to be transmitted, to two receivers (RXs), each equipped with one antenna, is discussed. We consider a distributed channel state information (CSI) configuration where each TX has its own local estimate of the channel and no communication is possible between the TXs. Based on the distributed CSI configuration, we introduce a concept of distributed MIMO precoding. We focus on the high signal-to-noise ratio (SNR) regime such that the two TXs aim at designing a precoding matrix to cancel the interference. Building on the study of the multiple antenna broadcast channel, we obtain the following key results: We derive the multiplexing gain (MG) as a function of the scaling in the SNR of the number of bits quantizing at each TX the channel to a given RX. In particular, we show that the conventional Zero Forcing precoder is not MG maximizing, and we provide a precoding scheme optimal in terms of MG. Beyond the established MG optimality, simulations show that the proposed precoding schemes achieve better performance at intermediate SNR than known linear precoders.



I. INTRODUCTION 

One promising solution to answer the need for increased spectral efficiency in future wireless networks consists in the joint transmission from several transmitters (TXs) to serve multiple receivers (RXs), so-called Network MIMO [1], [2]. If all the TXs have access to the data symbols and to the global channel state information (CSI), the different TXs can then be seen as a unique virtual TX serving all the RXs. The precoding schemes of the multiple antenna broadcast channel (BC) can then be applied.

This work has been performed in the framework of the European research project ARTIST4G, which is partly funded by the European Union under its FP7 ICT Objective 1.1 - The Network of the Future.

Yet, this requires the sharing of the data symbol and the CSI between the TXs, which represents 
a high requirement on the network infrastructure. Indeed, while in future wireless networks (e.g. 
LTE Advanced), it is considered to link the TXs with the Core Network via high capacity links 
to share the data symbols with the cooperating TXs, the sharing of the CSI is done through 
limited rate feedback channels and limited capacity signaling (so called X2) links between the 
TXs. Thus, an interesting information theoretic MIMO channel arises whereby multiple TXs 
may access the same data symbols, but have a limited CSI sharing capability. We define this 
channel as the distributed CSI (DCSI)-MIMO channel. 

In the DCSI-MIMO channel, there may be inconsistencies between the different versions of 
CSI seen at the TXs due either to separate compression or separate feedback channels. Such 
inconsistencies can be detrimental to the channel capacity if they are not accounted for in the 
precoding design. This is the object of this work. 

To put this in contrast, note that in the BC, the impact of finite rate feedback [3]-[6] and the derivation of robust solutions [7], [8] have been the focus of many works, which have then been extended to the MIMO network setting [9], [10]. However, these works only focus on the case of imperfect CSI that is perfectly shared between the TXs and do not consider the case when each TX has its own imperfect estimate of the multi-user channel, which will be our focus in this work. This setting was first studied in [11], and a tractable discrete optimization at finite SNR was derived. However, it does not lend itself to a more general performance analysis.

Our work can be seen as a generalization to the distributed CSI setting of the study by Jindal [3] of the multiple-antenna BC, in which the Multiplexing Gain (MG) is derived as a function of the number of feedback bits by each RX. We here consider only two TX-RX pairs, while the generalization to multiple TX-RX pairs is carried out in [12]. We consider only Zero-Forcing schemes, which are known to achieve the maximal MG with perfect CSI in the MIMO BC.

Specifically, the main contributions are as follows. Let us first define the number of bits quantizing the estimate at TX $j$ of the normalized channel $\tilde{h}_i^H$ from the two TXs to RX $i$ as $\alpha_i^{(j)} \log_2(P)$ with $\alpha_i^{(j)} \in [0,1]$. Then, we show that:

• The MG achieved with conventional Zero Forcing at RX $i$ is equal to $\min_{i,j \in \{1,2\}} \alpha_i^{(j)}$.

• The optimal MG at RX $i$ is equal to $\max_{j \in \{1,2\}} \alpha_i^{(j)}$.

• We provide a precoding scheme achieving the maximal MG, as well as practical precoding 
schemes outperforming known linear precoding schemes at finite SNR for the DCSI-MIMO 
channel. 

Notations: We denote by $\Pi_u(\cdot)$ and $\Pi_u^{\perp}(\cdot)$ the orthogonal projectors over the subspace spanned by $u$ and over its orthogonal complement, respectively. $\bar{i}$ denotes the complementary index of $i$, i.e., $\bar{i} = i \bmod 2 + 1$.

II. System Model 

We first present the classical multicell MIMO model before introducing our novel concepts 
of distributed CSI and distributed precoding. 



A. Multicell MIMO 

We consider a joint downlink transmission from two TXs to two RXs using linear precoding 
and single user decoding. For ease of exposition, the TXs and the RXs are equipped with only 
one antenna, such that the received signal is written as 

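$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = Hx + \eta = \begin{bmatrix} h_1^H x \\ h_2^H x \end{bmatrix} + \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} = \begin{bmatrix} \|h_1\| \tilde{h}_1^H x \\ \|h_2\| \tilde{h}_2^H x \end{bmatrix} + \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} \tag{1}$$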



where $y_i$ is the signal received at the $i$-th RX, $h_i^H \in \mathbb{C}^{1 \times 2}$ is the channel from the TXs to the $i$-th RX, $\tilde{h}_i^H \triangleq h_i^H / \|h_i\|$ is the normalized channel, $\eta_i \sim \mathcal{CN}(0,1)$ is the i.i.d. circularly symmetric complex Gaussian noise at the $i$-th RX, and $x \in \mathbb{C}^{2 \times 1}$ is the transmitted signal from the TXs. The channel is block fading and the entries of the channel matrix $H$ are i.i.d. circularly symmetric complex Gaussian with unit variance to model a Rayleigh fading channel. The transmitted signal $x$ is obtained from the vector of transmit symbols $s = [s_1, s_2]^T \in \mathbb{C}^{2 \times 1}$ (whose entries are assumed to be independent $\mathcal{CN}(0,1)$) as

$$x = Ts = \begin{bmatrix} t_1 & t_2 \end{bmatrix} \begin{bmatrix} s_1 \\ s_2 \end{bmatrix} \tag{2}$$



where $T \in \mathbb{C}^{2 \times 2}$ and $t_i \in \mathbb{C}^{2 \times 1}$ is the beamforming vector used to transmit $s_i$. We consider a sum power constraint $\|T\|_F^2 = P$ and we also assume for simplicity and symmetry that $t_i = \sqrt{P/2}\, u_i$ with $\|u_i\|_2 = 1$. Note that normalizing the individual columns does not alter the ability to zero-force the interference so that it does not affect the MG.
We also define the MG at RX $i$ as

$$M_{G_i} \triangleq \lim_{P \to \infty} \frac{R_i(P)}{\log_2(P)} \tag{3}$$

so that the total MG is $M_G = M_{G_1} + M_{G_2}$.

We will study the long-term average throughput over the fading distribution and also over the realizations of the Random Vector Quantization (RVQ) codebooks used for the CSI quantization (cf. Subsection II-C), such that the throughput for RX $i$ reads as

$$R_i(P) = \mathrm{E}_{H,\mathcal{W}}\left[\log_2\left(1 + \frac{|h_i^H t_i|^2}{1 + |h_i^H t_{\bar{i}}|^2}\right)\right] \tag{4}$$

where $\bar{i}$ is the index of the other RX. To achieve the maximal MG we aim at removing all the interference, i.e., at having

$$\mathcal{I}_1(t_2) \triangleq |h_1^H t_2|^2 = 0, \quad \text{and} \quad \mathcal{I}_2(t_1) \triangleq |h_2^H t_1|^2 = 0. \tag{5}$$

From (5), it follows that the optimization of the two beamforming vectors $t_1$ and $t_2$ can be uncoupled.
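The rate expression (4) and the zero-interference conditions (5) can be sketched numerically for a single channel realization with perfect CSI. The following is a minimal illustration under stated assumptions: the helper names (`inner`, `unit_orthogonal`, `zf_rates`) and the channel values are hypothetical and chosen only for the example, and the expectation in (4) is not evaluated.

```python
import math

def inner(u, v):
    """Hermitian inner product u^H v for lists of complex numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def unit_orthogonal(h_other, h_own):
    """Unit vector along the projection of h_own onto the orthogonal complement of h_other."""
    coeff = inner(h_other, h_own) / inner(h_other, h_other).real
    p = [b - coeff * a for a, b in zip(h_other, h_own)]
    norm = math.sqrt(inner(p, p).real)
    return [x / norm for x in p]

def zf_rates(h1, h2, P):
    """Interference terms of (5) and per-RX rates of (4) for one realization, perfect CSI."""
    scale = math.sqrt(P / 2)                      # t_i = sqrt(P/2) u_i
    t1 = [scale * x for x in unit_orthogonal(h2, h1)]   # t_1 is orthogonal to h_2
    t2 = [scale * x for x in unit_orthogonal(h1, h2)]   # t_2 is orthogonal to h_1
    i1 = abs(inner(h1, t2)) ** 2                  # interference at RX 1
    i2 = abs(inner(h2, t1)) ** 2                  # interference at RX 2
    r1 = math.log2(1 + abs(inner(h1, t1)) ** 2 / (1 + i1))
    r2 = math.log2(1 + abs(inner(h2, t2)) ** 2 / (1 + i2))
    return (i1, i2), (r1, r2)

# Illustrative channel values only (not taken from the paper).
h1 = [0.8 + 0.3j, -0.2 + 0.5j]
h2 = [0.1 - 0.7j, 0.9 + 0.4j]
(i1, i2), (r1, r2) = zf_rates(h1, h2, P=1000.0)
print(i1, i2, r1, r2)
```

Each beamformer depends only on the channel it must be orthogonal to and on its own direct channel, which is the uncoupling noted after (5).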

B. Distributed CSI 

We assume a limited CSI setting where finite quality channel estimates are obtained from quantizing the true channel vectors. The distributed CSI is defined here in the sense that each TX has a different estimate of the normalized channel $\tilde{h}_i$ from all TXs to RX $i$. Moreover, the estimates for $\tilde{h}_1$ and $\tilde{h}_2$ are also a priori of statistically different qualities. We denote by $\tilde{h}_i^{(j)}$ the estimate of the normalized channel vector $\tilde{h}_i$ acquired at TX $j$. Furthermore, the number of quantizing bits for $\tilde{h}_i^{(j)}$ is given by $B_i^{(j)}$.

In the context of the MIMO BC, it is shown in [3] that the number of quantization bits should scale indefinitely with the SNR in order to achieve a positive MG with ZF. This also holds in a distributed CSI configuration, so that we focus on the scaling of the number of quantization bits in the logarithm of the SNR:

$$\alpha_i^{(j)} \triangleq \lim_{P \to \infty} \frac{B_i^{(j)}}{\log_2(P)}. \tag{6}$$

Since $\alpha_i^{(j)} = 1, \forall i,j \in \{1,2\}$ is shown later in Theorem 1 to be sufficient to achieve the maximal MG, we will always consider that $\alpha_i^{(j)} \in [0,1]$.
C. Random Vector Quantization

We consider the performance averaged over randomly generated codebooks used to quantize the channels. This follows a result in [13] stating that in the case of two antennas at the TX, no codebook can achieve in the single TX case a better MG than the MG achieved with RVQ. Moreover, RVQ is shown to be optimal as the number of antennas tends to infinity at the TX and the RXs [13].

However, in the MIMO BC, a codeword $c$ is selected to quantize $h$ if it maximizes the inner product $|h^H c|$ over the codebook. Any other codeword of the form $c e^{j\phi}$, where $\phi$ is any real number, achieves the same performance and can be selected indifferently. This is problematic in a distributed setting since we are now interested in $\|\tilde{h}_1^{(1)} - \tilde{h}_1^{(2)}\|$, and even if the codewords at TX 1 and TX 2 are $e^{j\phi_1}\tilde{h}_1$ and $e^{j\phi_2}\tilde{h}_1$ respectively, i.e., exactly in the direction of $\tilde{h}_1$, the two estimates can differ greatly in norm.

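Our solution is for each codeword and each channel estimate to choose $e^{j\phi}$ as the complex conjugate of the first vector element divided by its absolute value, thus making the first vector element real valued. Because of this choice, the quantization scheme is no longer in the Grassmann manifold and we have to consider the isomorphism between $\mathbb{C}$ and $\mathbb{R}^2$. Thus, for the quantization, each complex vector is considered as a vector of $\mathbb{R}^4$ made of the stacked real and imaginary parts. Moreover, since the first coefficient is real valued, we in fact only have to consider $\mathbb{R}^3$. A vector $u \in \mathbb{C}^2$ with its first coefficient real valued is represented in $\mathbb{R}^3$ as $u_{\mathbb{R}^3}$ and is defined as

$$u_{\mathbb{R}^3} \triangleq \begin{bmatrix} \mathrm{Re}(u_1) \\ \mathrm{Re}(u_2) \\ \mathrm{Im}(u_2) \end{bmatrix}. \tag{7}$$

Thus, we define the angle between $u_{\mathbb{R}^3}$ and $v_{\mathbb{R}^3}$ in $\mathbb{R}^3$ as

$$\angle(u_{\mathbb{R}^3}, v_{\mathbb{R}^3}) \triangleq \arccos\left(\frac{|u_{\mathbb{R}^3}^T v_{\mathbb{R}^3}|}{\|u_{\mathbb{R}^3}\| \|v_{\mathbb{R}^3}\|}\right). \tag{8}$$

Finally, the estimate $\tilde{h}_i^{(j)}$ is chosen as the element of the random codebook $\mathcal{W}$ which maximizes the cosine of the angle between the codeword in $\mathbb{R}^3$ and the true channel in $\mathbb{R}^3$:

$$\tilde{h}_{i,\mathbb{R}^3}^{(j)} = \arg\max_{c_{\mathbb{R}^3} \in \mathcal{W}_{\mathbb{R}^3}} \cos\left(\angle(c_{\mathbb{R}^3}, \tilde{h}_{i,\mathbb{R}^3})\right) = \arg\max_{c_{\mathbb{R}^3} \in \mathcal{W}_{\mathbb{R}^3}} |c_{\mathbb{R}^3}^T \tilde{h}_{i,\mathbb{R}^3}|. \tag{9}$$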
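The phase normalization, the mapping (7) to $\mathbb{R}^3$, and the codeword selection rule (9) can be sketched as follows. This is a minimal illustration, assuming codewords drawn as normalized i.i.d. complex Gaussian vectors; the function names, the seed, and the choice of 6 bits are hypothetical and serve only the example.

```python
import math
import random

def normalize_phase(v):
    """Scale v to unit norm and rotate its phase so that the first element is real and non-negative."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    rot = v[0].conjugate() / abs(v[0]) if abs(v[0]) > 0 else 1.0
    return [a * rot / norm for a in v]

def to_r3(v):
    """Represent a phase-normalized vector of C^2 in R^3 as in (7)."""
    return (v[0].real, v[1].real, v[1].imag)

def random_codebook(num_bits, rng):
    """Draw 2**num_bits phase-normalized codewords (assumed i.i.d. complex Gaussian directions)."""
    return [normalize_phase([complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)])
            for _ in range(2 ** num_bits)]

def quantize(h, codebook):
    """Select the codeword maximizing |c_R3^T h_R3|, following (9)."""
    h3 = to_r3(normalize_phase(h))
    return max(codebook, key=lambda c: abs(sum(a * b for a, b in zip(to_r3(c), h3))))

rng = random.Random(0)
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
estimate = quantize(h, random_codebook(6, rng))
print(to_r3(estimate))
```

Because every codeword and the channel itself are rotated to the same phase reference, two TXs that select nearby codewords also hold estimates that are close in norm, which is the property needed in the distributed setting.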
D. Distributed Precoding

In the distributed CSI setting, each TX has a different estimate of the channel, which it uses to compute the precoding matrix. We denote the overall multi-transmitter precoder computed at TX $j$ as

$$T^{(j)} \triangleq \begin{bmatrix} t_1^{(j)} & t_2^{(j)} \end{bmatrix} \tag{10}$$

where $t_i^{(j)}$ is the beamforming vector transmitting $s_i$ computed at TX $j$. Note that although a given TX $j$ may compute the whole precoding matrix $T^{(j)}$, only the $j$-th row will be used in practice since the other row corresponds to the coefficients being implemented at the other TX. Finally, the effective precoder $T$ is given by the first row of $T^{(1)}$ stacked on top of the second row of $T^{(2)}$.

III. Main Theorems on the Multiplexing Gain



In the multiple antenna BC with perfect CSI, ZF achieves the maximal MG and can be 
conjectured to be also optimal with imperfect CSI. The central question of this paper is whether 
this result still holds in the DCSI-MIMO channel, and what are otherwise the MG optimal 
precoding strategies. 



A. Conventional Zero Forcing 

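The conventional ZF precoder applied distributively consists in transmitting symbol $i$ with the beamformer $t_i^{\mathrm{cZF}} = [t_{1i}^{\mathrm{cZF}}, t_{2i}^{\mathrm{cZF}}]^T$, with its elements defined in an intuitive manner as

$$t_{ji}^{\mathrm{cZF}} \triangleq \sqrt{\frac{P}{2}}\, e_j^T \frac{\Pi^{\perp}_{\tilde{h}_{\bar{i}}^{(j)}}\big(\tilde{h}_i^{(j)}\big)}{\big\|\Pi^{\perp}_{\tilde{h}_{\bar{i}}^{(j)}}\big(\tilde{h}_i^{(j)}\big)\big\|}, \quad j \in \{1,2\}, \tag{12}$$

where $e_j$ denotes the $j$-th column of the identity matrix. The interpretation behind conventional ZF is that each TX applies ZF with its own CSI, implicitly assuming that the other TX shares the same CSI estimate. Our first important result, given in the following theorem, states the MG achieved with such a precoding strategy.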



Theorem 1. Conventional ZF achieves the following MG:

$$M_G^{\mathrm{cZF}} = 2 \min_{i,j \in \{1,2\}} \alpha_i^{(j)}. \tag{13}$$

Proof: A detailed proof is given in Appendix VI-B. ■

Corollary 1. Conventional ZF achieves the maximal MG if and only if the CSI scaling is identical across the RXs and the TXs, i.e.,

$$\forall i,j,\ell,k \in \{1,2\}, \quad \alpha_i^{(j)} = \alpha_\ell^{(k)}. \tag{14}$$

Proof: The corollary follows from the comparison between the MG given in Theorem 1 and the MG achieved in a multiple antenna BC with imperfect CSI of the same quality [3]. ■

It means that if the quality of the CSI is the same across all the TXs, it is in fact sufficient to apply conventional ZF. This result is not trivial, since additional errors arise from the fact that the estimates are not shared: the quality of the CSI is the same but the estimates are different. This corollary also shows that the additional errors due to the CSI inconsistency do not lead to any further loss in MG.
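The effect of inconsistent estimates on conventional ZF can be illustrated with a small numerical sketch. It assumes that each TX computes the unit-norm projection of its direct-channel estimate onto the orthogonal complement of its estimate of the other RX's channel, and that only the coefficient of row $j$ computed at TX $j$ is actually transmitted; the helper names and all numerical values are hypothetical.

```python
import math

def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def unit_orthogonal(h_other, h_own):
    """Unit vector along the projection of h_own onto the orthogonal complement of h_other."""
    coeff = inner(h_other, h_own) / inner(h_other, h_other).real
    p = [b - coeff * a for a, b in zip(h_other, h_own)]
    norm = math.sqrt(inner(p, p).real)
    return [x / norm for x in p]

def distributed_czf(est_tx1, est_tx2, P):
    """Effective beamformers [t_1, t_2]; est_txj = (estimate of h_1, estimate of h_2) at TX j.

    Assumption: each TX computes a full ZF beamformer from its own estimates
    and only its own coefficient (row j) is actually transmitted.
    """
    scale = math.sqrt(P / 2)
    beams = []
    for i in (0, 1):
        u_tx1 = unit_orthogonal(est_tx1[1 - i], est_tx1[i])
        u_tx2 = unit_orthogonal(est_tx2[1 - i], est_tx2[i])
        beams.append([scale * u_tx1[0], scale * u_tx2[1]])
    return beams

# Illustrative values only (not taken from the paper).
h1 = [0.8 + 0.3j, -0.2 + 0.5j]
h2 = [0.1 - 0.7j, 0.9 + 0.4j]
h1_at_tx2 = [0.85 + 0.3j, -0.2 + 0.45j]   # TX 2 holds a slightly different estimate of h_1

_, t2_same = distributed_czf((h1, h2), (h1, h2), P=1000.0)
_, t2_diff = distributed_czf((h1, h2), (h1_at_tx2, h2), P=1000.0)
leak_same = abs(inner(h1, t2_same)) ** 2   # interference at RX 1, consistent CSI
leak_diff = abs(inner(h1, t2_diff)) ** 2   # interference at RX 1, inconsistent CSI
print(leak_same, leak_diff)
```

With identical estimates at both TXs the interference at RX 1 vanishes up to rounding, whereas a small mismatch at one TX leaves a residual term that grows with $P$.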

B. Robust Zero Forcing 

Comparing the MG in Theorem 1 with the MG in a multiple antenna BC [3], it appears that in the case of imperfectly shared CSI, the MG is limited by the worst quality of the CSI across the channels to the RXs and across the TXs, which is a very pessimistic result. Robust precoding schemes have been derived in the literature, either as statistically robust ZF precoders or as precoders optimizing the worst-case performance [7], to reduce the harmful effect of the imperfect CSI. However, the robust versions improve the rate offset but do not have any impact on the MG.

C. Beacon Zero Forcing 

Robust ZF schemes from the literature do not bring any MG improvement. This leads us to investigate other schemes more adapted to the DCSI-MIMO channel. Thus, we now propose a modification of the conventional ZF scheme which improves the MG when the estimates for $\tilde{h}_1$ and $\tilde{h}_2$ are of different qualities. We call it Beacon ZF (bZF) because it makes use of an arbitrary vector known at both TXs.

The beamformer used to transmit symbol $i$ is then $t_i^{\mathrm{bZF}} = [t_{1i}^{\mathrm{bZF}}, t_{2i}^{\mathrm{bZF}}]^T$, with its elements defined as

$$t_{ji}^{\mathrm{bZF}} \triangleq \sqrt{\frac{P}{2}}\, e_j^T \frac{\Pi^{\perp}_{\tilde{h}_{\bar{i}}^{(j)}}(c_i)}{\big\|\Pi^{\perp}_{\tilde{h}_{\bar{i}}^{(j)}}(c_i)\big\|}, \quad j \in \{1,2\}, \tag{15}$$

where $e_j$ denotes the $j$-th column of the identity matrix and $c_i$ is a vector chosen beforehand and known at the TXs. Due to the isotropy of the channel, the choice of $c_i$ is arbitrary and does not influence the performance of the precoder.



Corollary 2. The MG achieved with beacon ZF is

$$M_G^{\mathrm{bZF}} = \min_{j \in \{1,2\}} \alpha_1^{(j)} + \min_{j \in \{1,2\}} \alpha_2^{(j)}. \tag{16}$$

Proof: The MG follows easily from Theorem 1. Indeed, when using beacon ZF, no error is induced by the projection of the direct channel, which is replaced by a fixed given vector. In terms of MG, there is no difference between projecting the direct channel or any given vector. Thus, we can ■

Corollary 3. Beacon ZF achieves the maximal MG if and only if $\forall i \in \{1,2\}, \alpha_i^{(1)} = \alpha_i^{(2)}$. Thus, the inconsistency in the channel realizations between the TXs does not reduce the MG.

Proof: Let us consider that the TXs are allowed to cooperate; then any of the estimates can be used and the other discarded. The channel used in the orthogonality constraint is then known with the given accuracy and no ZF can improve the accuracy. Thus, the maximal accuracy is achieved using beacon ZF. ■

The key idea behind Beacon ZF is to reduce the impact of the difference in quality between the estimates of $\tilde{h}_1$ and $\tilde{h}_2$ by using only the CSI necessary to fulfill the orthogonality constraint and not the direct channel, whose use does not change the MG but only improves the finite SNR performance. Indeed, $t_1^{\mathrm{bZF}}$ does not depend on the estimates of $\tilde{h}_1$ and symmetrically $t_2^{\mathrm{bZF}}$ does not depend on the estimates of $\tilde{h}_2$.



D. Active/Passive - Zero Forcing 

Beacon ZF improves the MG in some settings, but the MG is still determined by the worst CSI quality across the TXs. Thus, we now propose a scheme called Active/Passive Zero Forcing (A/P-ZF) to address this problem. Assuming wlog that $\alpha_{\bar{i}}^{(2)} \ge \alpha_{\bar{i}}^{(1)}$, it consists in the precoder whose beamformer to transmit symbol $i$ is given by

1 

(17) 
where $\tilde{h}_i^{(j)H} \triangleq [\tilde{h}_{i1}^{(j)}, \tilde{h}_{i2}^{(j)}]$, $\rho_i^{(j)} \triangleq |\tilde{h}_{i1}^{(j)}|^2 / |\tilde{h}_{i2}^{(j)}|^2$, and $\|u_i^{\mathrm{A/P\text{-}ZF}}\| = 1$.

A/P-ZF is based on the idea that each beamforming vector has to fulfill one orthogonality constraint, so that only one degree of freedom is necessary. Thus, one coefficient can be set to a constant while still fulfilling the ZF constraints. Additionally, the other underlying idea is that the only way to achieve the MG stemming from the best CSI estimate is if TX 2 (which has the best knowledge of $\tilde{h}_1$) can adapt to the transmission done at TX 1 to adjust its beamforming vector and improve how the interference is suppressed. This is possible only if TX 2 knows the coefficient used to transmit at TX 1, which means that TX 1 should not use its own CSI and should transmit with a fixed coefficient. The MG using this precoding scheme is then given in the following proposition.

Proposition 1. Active/Passive ZF achieves the MG:

$$M_G^{\mathrm{A/P\text{-}ZF}} \ge \max_{j \in \{1,2\}} \alpha_1^{(j)} + \max_{j \in \{1,2\}} \alpha_2^{(j)}. \tag{18}$$



Proof: Due to the symmetry between the two RXs, we consider only the MG at RX 1, and we consider that the beamformers $t_1$ and $t_2$ are given by (17). We still assume wlog that $\alpha_1^{(2)} \ge \alpha_1^{(1)}$, i.e., TX 2 has the best CSI over $\tilde{h}_1$. Using A/P-ZF, the MG at RX 1 reads as

where we have used Jensen's inequality for the last inequality. We now consider the interference term $\mathcal{I}_1(t_2)$:

By construction, $t_2$ is orthogonal to $\tilde{h}_1^{(2)}$, so that

Inserting (24) into the MG expression (20) and using Proposition 3 of Appendix VI-A we obtain

which is the best scaling across the TXs. ■
Comparing the MG achieved with A/P-ZF with the MG achieved when both TXs are allowed 
to exchange their channel estimates, the following fundamental result follows directly. 

Theorem 2. Active/Passive ZF achieves the maximal MG. 

Proof: Let us assume that the sharing of the channel estimates is allowed between the TXs. Then it is optimal to use the best estimate for each of the channel vectors and simply discard the other estimate. In that case, the TXs share the same CSI quality and it is optimal to apply Beacon-ZF, which achieves the same MG as given in Proposition 1. This MG is an upper bound for the MG, so that A/P-ZF achieves the maximal MG. ■

A/P-ZF makes it possible to recover the MG which would have been achieved with the sharing of the estimates. However, one last point remains to be discussed: the choice of the coefficient used to transmit at TX 1. Actually, the beamformer can be multiplied by any unit-norm complex number without impacting the rate achieved, so that only the power used at TX 1 needs to be decided. In (17), the power used is set to $P/(2\log_2(P))$, which follows from the fact that the channel coefficient $h_{22}$ might have a very small amplitude, in which case it would be necessary for TX 2 to transmit with a very large power to cancel the interference. To ensure that the interference is canceled for all channel realizations while respecting the power constraint, it is necessary to have the ratio between the power used at TX 1 and the total power tend to zero. The factor $\log_2(P)$ is used because it fulfills this property while not reducing the MG due to the partial power consumption.
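The two properties required of the passive power $P/(2\log_2(P))$ can be checked numerically: its share of the total power tends to zero, while its pre-log factor relative to $\log_2(P)$ tends to one. The sketch below only evaluates these two ratios; the function names and the SNR grid are hypothetical choices for illustration.

```python
import math

def passive_power_share(P):
    """Fraction of the total power P used by the passive TX transmitting with P / (2 log2 P)."""
    return 1.0 / (2.0 * math.log2(P))

def prelog_of_passive_power(P):
    """log2 of the passive power divided by log2(P); this ratio tends to one as P grows."""
    return math.log2(P / (2.0 * math.log2(P))) / math.log2(P)

for snr_db in (20, 40, 80, 160):
    P = 10 ** (snr_db / 10)
    print(snr_db, round(passive_power_share(P), 4), round(prelog_of_passive_power(P), 4))
```

The convergence of the second ratio is slow, since the gap behaves like $\log_2(2\log_2 P)/\log_2 P$, which is consistent with the finite-SNR power inefficiency discussed for A/P-ZF.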



E. Power Control for A/P-ZF Precoding 

We have seen that A/P-ZF could achieve a much better MG than conventional ZF. However, 
this comes at the cost of using only a small share of the available power, which is clearly 
inefficient and leads to bad performances at finite SNR. To improve the performances, the TX 
with the worst accuracy needs to adapt its power consumption to the channel realizations. In the 
following, we propose two possible solutions. 

• Firstly, TX 1 can use its local CSI to normalize the beamformer, which is then given by

with $\rho_i^{(j)} \triangleq |\tilde{h}_{i1}^{(j)}|^2 / |\tilde{h}_{i2}^{(j)}|^2$. This beamformer is not MG maximizing because the local CSI is used at TX 1 so that TX 2 cannot adapt to it to cancel the interference, and the beamformer is not orthogonal to $\tilde{h}_{\bar{i}}^{(2)}$. Yet, this solution achieves good performance at intermediate SNR.

• Another possibility is to assume that TX 1 receives the scalar $\rho_{\bar{i}}^{(2)}$ (or $\rho_{\bar{i}}$) and uses it to control its power. This means that either the RX or TX 2 needs to feed back this scalar. It requires an additional feedback, but only a few bits are necessary, because it is only used to improve the power efficiency and does not impact the MG. Thus, the feedback of this scalar does not change the scaling of the CSI in terms of the SNR, and thus appears as an interesting practical solution.
IV. Simulations

We consider two models for the imperfect CSI, a statistical model and RVQ. In the statistical model, the quantization error is modeled by adding an i.i.d. Gaussian quantization noise to the channel, with the covariance matrix at TX $j$ equal to $\mathrm{diag}([1/P^{\alpha_1^{(j)}}, 1/P^{\alpha_2^{(j)}}])$. When considering a given finite number of feedback bits, we compute $\alpha_i^{(j)} = B_i^{(j)}/\log_2(P)$, so that $\mathrm{diag}([1/P^{\alpha_1^{(j)}}, 1/P^{\alpha_2^{(j)}}]) = \mathrm{diag}([1/2^{B_1^{(j)}}, 1/2^{B_2^{(j)}}])$. For RVQ, we consider a number of quantizing bits either numerically given or obtained from the CSI scaling as $B_i^{(j)} = \lfloor \alpha_i^{(j)} \log_2(P) \rfloor$. In the statistical model, we average over 10000 realizations, and for RVQ we average over 100 codebooks and 1000 channel realizations. In the simulations, we consider the following precoders: ZF with perfect CSI, conventional ZF [cf. (12)], beacon ZF [cf. (15)], and Active/Passive ZF [cf. (17)] with heuristic power control and with 4-bit power control.
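The mapping between the CSI scaling, the number of quantization bits, and the error variance of the statistical model can be sketched as follows. The helper names and the SNR values are hypothetical; the formulas are the floor rule for the number of bits and the variance $1/P^{\alpha}$ described above.

```python
import math

def feedback_bits(alpha, snr_db):
    """Number of quantization bits floor(alpha * log2(P)) for a given SNR in dB."""
    P = 10 ** (snr_db / 10)
    return math.floor(alpha * math.log2(P))

def error_variance(alpha, snr_db):
    """Variance 1 / P**alpha of the quantization noise in the statistical model."""
    P = 10 ** (snr_db / 10)
    return P ** (-alpha)

for alpha in (0.0, 0.5, 0.7, 1.0):
    print(alpha, feedback_bits(alpha, 30), error_variance(alpha, 30))
```

A scaling of zero corresponds to no feedback bits and an error variance that does not decrease with the SNR, which is the regime in which a scheme relying on that estimate saturates.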

In Fig. 1, we consider the statistical model with the CSI scaling $[\alpha_1^{(1)}, \alpha_1^{(2)}] = [1, 0.5]$ and $[\alpha_2^{(1)}, \alpha_2^{(2)}] = [0, 0.7]$. To emphasize the MG (i.e., the slope of the curve in the figure), we let the SNR grow large. As expected theoretically, conventional ZF saturates at high SNR, while Beacon ZF has a positive slope and Active/Passive ZF performs close to perfect ZF with a slope only slightly smaller than the optimal one.
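The slopes expected for this CSI scaling can be computed directly from the MG expressions of the three schemes: twice the overall minimum for conventional ZF, the sum over the RXs of the per-RX minimum for beacon ZF, and the sum of the per-RX maximum for Active/Passive ZF. The sketch below is a direct transcription of these expressions; the function names and the nested-list layout of the scalings are hypothetical.

```python
def mg_conventional_zf(alpha):
    """Total MG of conventional ZF: twice the worst scaling over all RXs and TXs."""
    return 2 * min(a for row in alpha for a in row)

def mg_beacon_zf(alpha):
    """Total MG of beacon ZF: sum over the RXs of the worst scaling across the TXs."""
    return sum(min(row) for row in alpha)

def mg_active_passive_zf(alpha):
    """Total MG of Active/Passive ZF: sum over the RXs of the best scaling across the TXs."""
    return sum(max(row) for row in alpha)

# alpha[i][j] is the scaling of the estimate at TX j+1 of the channel to RX i+1.
alpha = [[1.0, 0.5], [0.0, 0.7]]
print(mg_conventional_zf(alpha), mg_beacon_zf(alpha), mg_active_passive_zf(alpha))
```

For the scaling of Fig. 1 these expressions evaluate to 0, 0.5 and 1.7, against a total MG of 2 with perfect CSI, which matches the saturation of conventional ZF and the ordering of the slopes described above.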

In Fig. 2 and Fig. 3, we plot the sum rate achieved with the CSI feedback $[B_1^{(1)}, B_1^{(2)}] = [6, 3]$ and $[B_2^{(1)}, B_2^{(2)}] = [3, 6]$ for the statistical modeling and RVQ, respectively. Firstly, we can observe the good match between the two models used. From the theoretical analysis, the MG is null for all the precoding schemes for a finite number of feedback bits, which can be observed by the saturation of the sum rate as the SNR grows. Yet, the saturation occurs at higher SNR for Beacon ZF compared to conventional ZF, and at even higher SNR for Active/Passive ZF, which leads to an improvement of the sum rate even at intermediate SNR.

V. Conclusion 

In this work, the multiplexing gain in a two-cell broadcast channel where the TXs have different estimates of the multi-user channel has been studied. We have shown that usual Zero Forcing precoding applied without taking into account the differences in CSI quality falls far short of the maximal MG. We have also derived the value of the maximal MG in that distributed CSI configuration and provided an MG maximizing precoding scheme. Moreover, we have shown by simulations that the new precoding approach outperforms known linear precoding schemes at intermediate SNR. We have considered only two TXs and two RXs with a single antenna to keep the notations simple, but the extension to multiple-antenna TXs or RXs appears to be tractable, while the analysis in the case of K TX-RX pairs with a single antenna is done in [12].

Fig. 1. Sum rate in terms of the SNR with a statistical modeling of the error from RVQ using $[\alpha_1^{(1)}, \alpha_1^{(2)}] = [1, 0.5]$ and $[\alpha_2^{(1)}, \alpha_2^{(2)}] = [0, 0.7]$.

VI. Appendix

A. Some Results on Vector Quantization 

In this section, we recall some results on vector quantization from [14] and we derive some new properties which will be needed for the following proofs. We consider the angle between two vectors as defined in (8). It means that we are considering real vectors of unit norm in the linear space $\mathbb{R}^{2n-1}$ as explained in Subsection II-C.

Proposition 2 ([14], Corollary 2). The cumulative distribution function (CDF) of $d^2(h,c) = \sin^2(\angle(h,c))$, where $c$ is an element of a random codebook, is bounded as

$$c_{2n-1} x^{n-1} \le F(x) \triangleq \Pr\{\sin^2(\angle(h,c)) \le x\} \le c_{2n-1} x^{n-1} (1-x)^{-1/2} \tag{27}$$

where $h$ is the true unit-norm channel.

Proposition 3 ([14], Theorem 2). When the size $K = 2^B$ of the random codebook is sufficiently large ($c_{2n-1}^{-1/(n-1)} 2^{-B/(n-1)} \le 1$ necessary), then it holds that

$$\mathrm{E}_{\mathcal{C},h}\left[\min_{c \in \mathcal{C}} \sin^2(\angle(h,c))\right] \le \frac{\Gamma\left(\frac{1}{n-1}\right)}{n-1}\, c_{2n-1}^{-\frac{1}{n-1}}\, 2^{-\frac{B}{n-1}} (1 + o(1)) \tag{28}$$

where $h$ is the true channel and $c_{2n-1} = \Gamma(n - 1/2)/(\Gamma(n)\Gamma(1/2))$.

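Proposition 4. The expectation of the logarithm of the quantization error is bounded as

$$\frac{B + \log_2(c_{2n-1})}{n-1} \le \mathrm{E}_{\mathcal{C},h}\left[-\log_2\left(\min_{c \in \mathcal{C}} \sin^2(\angle(h,c))\right)\right] \le \frac{B + \log_2(c_{2n-1}) + \log_2(e)}{n-1} \tag{29}$$

where the minimizing codeword is the best estimate over a random codebook $\mathcal{C}$ of size $2^B$, $h$ is the true channel and $c_{2n-1} \triangleq \Gamma(n - 1/2)/(\Gamma(n)\Gamma(1/2))$.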

Proof: Upper Bound: The derivation of an upper bound follows the same idea as the proof in Appendix B of [14], which derives an upper bound for the expectation as considered here, but without the logarithm. We start by recalling a lemma from [14] which follows easily from the definition but brings some insight.

Lemma 1 ([14], Lemma 3). The empirical distribution function minimizing the distortion over a given $K$ is

$$F_K^*(x) = \begin{cases} 0 & \text{if } x < 0 \\ K F(x) & \text{if } 0 \le x \le x^* \\ 1 & \text{if } x > x^* \end{cases} \tag{30}$$

where $x^*$ satisfies $K F(x^*) = 1$ and $F(x) = \Pr\{\sin^2(\angle(h,c)) \le x\}$ is the CDF of the squared distance between a random vector and one element of the random codebook.

Note that Lemma 1 corresponds to the optimal codebook minimizing the average distance and thus corresponds to a lower bound for the distortion. We define $Z = \sin^2(\angle(h,c))$ and use the fact that the term considered in the expectation is a positive random variable to rewrite the expectation as a function of its CDF.

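$$\mathrm{E}_{\mathcal{C},h}\left[-\log\left(\min_{c \in \mathcal{C}} \sin^2(\angle(h,c))\right)\right] = \int_0^\infty \Pr\left\{-\log\left(\min_{c \in \mathcal{C}} \sin^2(\angle(h,c))\right) \ge z\right\} dz = \int_0^\infty \Pr\left\{\min_{c \in \mathcal{C}} \sin^2(\angle(h,c)) \le e^{-z}\right\} dz \le \int_0^\infty \min\left\{1, K \Pr\{Z \le e^{-z}\}\right\} dz. \tag{31}$$

Following the same approach as the proof in Appendix B of [14], we define $F_0(x) \triangleq c_{2n-1} x^{n-1}$ and $x_0$ so that $K F_0(x_0) = 1$. Let us also define $F^{\mathrm{ub}}(x) \triangleq c_{2n-1} x^{n-1} (1-x)^{-1/2}$ and $x_{\mathrm{ub}}$ so that $K F^{\mathrm{ub}}(x_{\mathrm{ub}}) = 1$. Finally, we define $F^{\mathrm{ubub}}(x) \triangleq c_{2n-1} x^{n-1} (1-x_0)^{-1/2}$ and $x_{\mathrm{ubub}}$ so that $K F^{\mathrm{ubub}}(x_{\mathrm{ubub}}) = 1$.

It holds by construction that $x_{\mathrm{ub}} \le x^* \le x_0$ since we know from Proposition 2 that $F_0(x) \le F(x) \le F^{\mathrm{ub}}(x)$. Thus, $(1-x)^{-1/2} \le (1-x_0)^{-1/2}$ for $x \in [0, x_{\mathrm{ub}}]$ so that $F^{\mathrm{ub}}(x) \le F^{\mathrm{ubub}}(x)$ for $x \in [0, x_{\mathrm{ub}}]$, which finally implies $x_{\mathrm{ubub}} \le x_{\mathrm{ub}}$. We can then use these relations to upper bound the expectation as

$$\mathrm{E}_{\mathcal{C},h}\left[-\log\left(\min_{c \in \mathcal{C}} \sin^2(\angle(h,c))\right)\right] \le \int_0^\infty \min\{1, K F(e^{-z})\}\, dz \le \int_0^\infty \min\{1, K F^{\mathrm{ub}}(e^{-z})\}\, dz \le \int_0^\infty \min\{1, K F^{\mathrm{ubub}}(e^{-z})\}\, dz = \int_0^{-\log(x_{\mathrm{ubub}})} dz + \int_{-\log(x_{\mathrm{ubub}})}^\infty K F^{\mathrm{ubub}}(e^{-z})\, dz. \tag{32}$$

We now replace $F^{\mathrm{ubub}}$ and $x_{\mathrm{ubub}}$ by their expressions and evaluate the integral:

$$\mathrm{E}_{\mathcal{C},h}\left[-\log\left(\min_{c \in \mathcal{C}} \sin^2(\angle(h,c))\right)\right] \le \frac{1}{n-1} \log\left(\frac{K c_{2n-1}}{(1-x_0)^{1/2}}\right) + \frac{1}{n-1} = \frac{1}{n-1}\left(\log(K c_{2n-1}) + 1\right) + \frac{1}{2(n-1)} \log\left(\frac{1}{1-x_0}\right) \tag{33}$$

where the last term vanishes for $K$ large enough since $x_0 = (K c_{2n-1})^{-1/(n-1)}$ tends to zero. Dividing by $\log(2)$ yields the final upper bound.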

Lower Bound: To derive the lower bound, we use the lower bound for the CDF given in Proposition 2. The CDF has a form very similar to the CDF for the single TX case considering complex quantization, so that we can adapt the approach of the proof of Lemma 3 by Jindal in [3] to the current setting. Defining also $Z = \sin^2(\angle(h,c))$, Proposition 2 gives $\Pr\{Z > z\} \le 1 - c_{2n-1} z^{n-1}$. We can use this bound to derive that

$$\Pr\left\{\min_{c \in \mathcal{C}} \sin^2(\angle(h,c)) \le z\right\} \ge 1 - \left(1 - c_{2n-1} z^{n-1}\right)^K. \tag{34}$$

A lower bound for the expectation of the logarithm can then be calculated as follows.

$$\mathrm{E}_{\mathcal{C},h}\left[-\log\left(\min_{c \in \mathcal{C}} \sin^2(\angle(h,c))\right)\right] \ge \int_0^\infty \left[1 - \left(1 - c_{2n-1} e^{-(n-1)z}\right)^K\right] dz = \int_0^\infty \sum_{k=1}^K \binom{K}{k} (-1)^{k+1} c_{2n-1}^k e^{-(n-1)kz}\, dz = \frac{1}{n-1} \sum_{k=1}^K \binom{K}{k} (-1)^{k+1} \frac{c_{2n-1}^k}{k} = \frac{1}{n-1} f(K) \tag{35}$$

where we have defined $f(p) \triangleq \sum_{k=1}^p \binom{p}{k} (-1)^{k+1} \frac{c_{2n-1}^k}{k}$. To compute the value of $f(K)$, we will use the following relation given in [15, Sec. 0.155]:

$$\sum_{k=0}^n \binom{n}{k} \frac{a^{k+1}}{k+1} = \frac{(a+1)^{n+1} - 1}{n+1}. \tag{36}$$

We will now rewrite $f(K)$ in order to be able to apply (36):

$$\begin{aligned} f(K) &= \sum_{k=1}^{K} \binom{K-1}{k-1} (-1)^{k+1} \frac{c_{2n-1}^k}{k} + \sum_{k=1}^{K-1} \binom{K-1}{k} (-1)^{k+1} \frac{c_{2n-1}^k}{k} \\ &= \sum_{k'=0}^{K-1} \binom{K-1}{k'} (-1)^{k'+2} \frac{c_{2n-1}^{k'+1}}{k'+1} + f(K-1) \\ &= -\frac{(-c_{2n-1} + 1)^K - 1}{K} + f(K-1) \\ &= \sum_{p=1}^K \frac{1 - (-c_{2n-1} + 1)^p}{p} \\ &= \sum_{p=1}^K \frac{1}{p} - \sum_{p=1}^K \frac{(1 - c_{2n-1})^p}{p}. \end{aligned} \tag{37}$$

Furthermore, we have the two following relations:

$$\log(K) \le \sum_{p=1}^K \frac{1}{p} \le \log(K) + 1, \qquad \log(1-x) = -\sum_{k=1}^\infty \frac{x^k}{k}, \text{ for } x \in [-1, 1). \tag{38}$$

Using these properties and dividing by $\log(2)$, we can obtain the final lower bound as

$$\mathrm{E}_{\mathcal{C},h}\left[-\log_2\left(\min_{c \in \mathcal{C}} \sin^2(\angle(h,c))\right)\right] \ge \frac{1}{(n-1)\log(2)} \sum_{p=1}^K \frac{1}{p} - \frac{1}{(n-1)\log(2)} \sum_{p=1}^K \frac{(1 - c_{2n-1})^p}{p} \ge \frac{\log_2(K)}{n-1} - \frac{1}{(n-1)\log(2)} \sum_{p=1}^\infty \frac{(1 - c_{2n-1})^p}{p} = \frac{\log_2(K) + \log_2(c_{2n-1})}{n-1}. \tag{39}$$
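The closed form obtained in (37) and the resulting lower bound (39) can be checked numerically for small codebook sizes. The sketch below is only a sanity check under the definitions above; the function names and the choice $n = 2$, $K = 16$ are hypothetical.

```python
import math

def c_const(n):
    """Constant c_{2n-1} = Gamma(n - 1/2) / (Gamma(n) Gamma(1/2))."""
    return math.gamma(n - 0.5) / (math.gamma(n) * math.gamma(0.5))

def f_alternating(K, c):
    """f(K) as the alternating binomial sum used in (35)."""
    return sum(math.comb(K, k) * (-1) ** (k + 1) * c ** k / k for k in range(1, K + 1))

def f_closed_form(K, c):
    """f(K) rewritten as in (37): sum over p of (1 - (1 - c)^p) / p."""
    return sum((1 - (1 - c) ** p) / p for p in range(1, K + 1))

def lower_bound_bits(K, c, n):
    """Right-hand side of (39): (log2(K) + log2(c)) / (n - 1)."""
    return (math.log2(K) + math.log2(c)) / (n - 1)

n, K = 2, 16
c = c_const(n)
print(f_alternating(K, c), f_closed_form(K, c), lower_bound_bits(K, c, n))
```

The alternating sum suffers from cancellation for large $K$, so the form (37) is the numerically safer one to evaluate when checking larger codebooks.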



B. Proof of Theorem 1

Proof: The proof generalizes the proof of Theorem 4 in Appendix IV of [3], which derives the multiplexing gain for the single TX case, to the distributed CSI configuration. However, an important difference, which makes the derivation more intricate, is that we are not only interested in the inner product between the estimate and the true channel but also in the difference in norm between the estimates at the two TXs. This is the reason why we need to consider Grassmannian beamforming in $\mathbb{R}^{2n-1}$ instead of $\mathbb{C}^n$, as already introduced in Subsection II-C.

We start by deriving two lemmas which form in fact the core of the proof.
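Lemma 2. Let the beamformers $u_1^{(1)}$ and $u_1^{(2)}$ be computed at TX 1 and TX 2 respectively, then

$$\left\|u_1^{(2)} - u_1^{(1)}\right\| \le C_{\mathrm{UB}} \max_{i=1,2,\; j=1,2} \sin\left(\angle(\tilde{h}_i^{(j)}, \tilde{h}_i)\right) \tag{40}$$

where $C_{\mathrm{UB}}$ is some positive constant.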



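Proof: Since the norm is conserved when considering the isomorphism between $\mathbb{C}$ and $\mathbb{R}^2$, we work in fact in $\mathbb{R}^2$ for the rest of the proof, even though we keep the same notations. The beamformer difference can then be rewritten as

$$u_1^{(2)} - u_1^{(1)} = \frac{(I_n - \tilde{h}_2^{(2)} \tilde{h}_2^{(2)T}) \tilde{h}_1^{(2)}}{\|(I_n - \tilde{h}_2^{(2)} \tilde{h}_2^{(2)T}) \tilde{h}_1^{(2)}\|} - \frac{(I_n - \tilde{h}_2^{(1)} \tilde{h}_2^{(1)T}) \tilde{h}_1^{(1)}}{\|(I_n - \tilde{h}_2^{(1)} \tilde{h}_2^{(1)T}) \tilde{h}_1^{(1)}\|}. \tag{41}$$

We now decompose the estimates at TX 2 over the estimates at TX 1 as

$$\tilde{h}_1^{(2)} = \Pi^{\perp}_{\tilde{h}_1^{(1)}}(\tilde{h}_1^{(2)}) + (\tilde{h}_1^{(1)T} \tilde{h}_1^{(2)}) \tilde{h}_1^{(1)}, \qquad \tilde{h}_2^{(2)} = \Pi^{\perp}_{\tilde{h}_2^{(1)}}(\tilde{h}_2^{(2)}) + (\tilde{h}_2^{(1)T} \tilde{h}_2^{(2)}) \tilde{h}_2^{(1)}. \tag{42}$$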






Using (1421) . we can write (hf^ r h2 ) as 



nt w (h?) + (h\^h?>W>) h 



T _ 



(2) 



(43) 



In a first step, inserting only (43) in (41), we can obtain the upper bound



(i) 



(2) _ 



,( 1 ) T »,( 2 )\^( 1 ) T TT-L 



.( 2 h 



,( 1 ) T ^;( 2 )^^( 1 ) T ^,( 1 )^r^;( 1 ) T i;( 2 )^ 



hi 



(2) 



U 



(1) 



< 



(2) 



£2 



where we have defined 

A 



£l ^|(n^(^)) T ^|/||(i n -M 2) M 2)T )^ 2) ll 

e 2 4 kM^jM^c^ji/iio, - M 2) M 2)T )^ 2) i 



(44) 



(45) 



We now derive an upper bound for $\varepsilon_1$; the same approach also holds for $\varepsilon_2$ and for other
similar expressions later in the proof.



UM 2) ) 



< 



ll(i, 



||(I„-^^ T )^| 

|sin(Z(hWfe( 2) ))| 



(46) 



We further use (42) to rewrite $\hat{h}_1^{(2)}$ and $\hat{h}_2^{(2)}$ and obtain an upper bound for the first term of the
right-hand side of (44), which we denote by A.



A = 



h 



(2) 



,(2) 



11(1. 



M 2) M 2)T )/4 2) i 



u 



(1) 



n^(M 2) ) + (^^D^ 



.(i) T f,( 2 h£(i) 



(I n -hf^ T )fcfl 



— it 



(i) 



< 



,(l) T ^( 2 )^(l) 



U 



(1) 



£3 + £4 



(47) 
where we have further defined



£3 = \\^tm(h. 



£4 



(2) 



M 2) M 2) Wl 



(48) 



Both $\varepsilon_3$ and $\varepsilon_4$ can be handled similarly to $\varepsilon_1$ in (46). The norm in the denominator is now
rewritten in terms of the estimates at TX 2 and approximated using the fact that the difference
between the estimates is small compared to one, since the accuracy of the CSI increases with
the SNR.

1 



1 



\\{Y n -h^Ur)hf\ 



where we have introduced 

a r( 

£ = hn 



1 



(M 1)T fc?w* 



(49) 



(50) 



The term with $\epsilon$ can be upper bounded using the triangle inequality and further upper bounded
by the product of the norms. Finally, the norm of $\epsilon$ can be upper bounded by using the same
steps that have been used for the numerator, i.e., by projecting the estimates at TX 2 over the
estimates at TX 1 and upper bounding the terms. Using the result of the side calculation in (49),
we can now write an asymptotic bound for the first term in the right-hand side of (47), which
we denote by B.



B = 



< 



;«T£;(2K/r(i)Tr(i)wr(i)Tr(2) 



(i)Tr(2),r(i) 



(i)Tr(ihr(i) 



< 



(h 



(i)Tr(2), 

2 ,l 2 1 



,(i)Tr(i). 



cos(Z(h«,hW)) 

2sin 2 (Z(^ 1) ,^ 2) )/2) 
sin 2 (Z(^ 1) ,^ 2) )) 



1 - (cos(Z(/ l i 1) ,^ 2) ))) 2 co S (Z(^ 1) ,^ 2) ))) hfhf 



l-(l-2sin 2 (Z(^ 1) ,^ 2) )/2)) 2 (l-2sin 2 (Z(^,^ J )/2)) 



(!) KPh 



cos^Z^Uf)) 



.sinV^Ui 2 ')) , si n 2 (Z(^,hf )) 



cos^Z^,^)) cos 2 (Z(^ i; ,^)) 



(i) £(ah 



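(51)

Putting all the pieces together, we have shown that

$$\left\| u_2^{(2)} - u_2^{(1)} \right\| \le a_1 \left|\sin\!\big(\angle(\hat{h}_1^{(1)}, \hat{h}_1^{(2)})\big)\right| + a_2 \left|\sin\!\big(\angle(\hat{h}_2^{(1)}, \hat{h}_2^{(2)})\big)\right| \tag{52}$$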
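which we now want to relate to the inner product with the true channels $h_1$ and $h_2$. Wlog we
focus on the term $|\sin(\angle(\hat{h}_1^{(1)}, \hat{h}_1^{(2)}))|$.

$$\begin{aligned}
\left|\sin\!\big(\angle(\hat{h}_1^{(1)}, \hat{h}_1^{(2)})\big)\right|
&= \sqrt{1 - \big|\hat{h}_1^{(1)T}\hat{h}_1^{(2)}\big|^2}\\
&= \big\|\Pi^{\perp}_{\hat{h}_1^{(1)}}\big(\hat{h}_1^{(2)}\big)\big\|\\
&= \big\|\Pi^{\perp}_{\hat{h}_1^{(1)}}\big(\Pi^{\perp}_{h_1}(\hat{h}_1^{(2)}) + (\hat{h}_1^{(2)T}h_1)\,h_1\big)\big\|\\
&= \big\|\Pi^{\perp}_{\hat{h}_1^{(1)}}\big(\Pi^{\perp}_{h_1}(\hat{h}_1^{(2)})\big) + \Pi^{\perp}_{\hat{h}_1^{(1)}}\big((\hat{h}_1^{(2)T}h_1)\,h_1\big)\big\|\\
&\le \big\|\Pi^{\perp}_{\hat{h}_1^{(1)}}\big(\Pi^{\perp}_{h_1}(\hat{h}_1^{(2)})\big)\big\| + \big\|\Pi^{\perp}_{\hat{h}_1^{(1)}}\big((\hat{h}_1^{(2)T}h_1)\,h_1\big)\big\|\\
&\le \big\|\Pi^{\perp}_{h_1}\big(\hat{h}_1^{(2)}\big)\big\| + \big\|\Pi^{\perp}_{\hat{h}_1^{(1)}}(h_1)\big\|\\
&\le \left|\sin\!\big(\angle(\hat{h}_1^{(2)}, h_1)\big)\right| + \left|\sin\!\big(\angle(\hat{h}_1^{(1)}, h_1)\big)\right|\\
&\le 2\left|\sin\!\Big(\max\big(\angle(\hat{h}_1^{(1)}, h_1),\, \angle(\hat{h}_1^{(2)}, h_1)\big)\Big)\right|.
\end{aligned} \tag{53}$$

This holds for the two channel vectors $h_1$ and $h_2$, so that taking the maximum over all the sines
and choosing the multiplicative constant as the sum of the multiplicative constants, we obtain
the result of the lemma. ■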
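The statement of Lemma 2 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: channels are real unit-norm vectors, each estimate is the true channel plus an error of norm `sigma` (a hypothetical error model, not the quantization scheme of the paper), and `zf_beamformer` is an illustrative helper implementing the projection in (41). It prints the beamformer mismatch between the two TXs next to the largest sine of the angle between an estimate and the true channel, so that the ratio of the two can be inspected as the estimates improve.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(a):
    s = math.sqrt(dot(a, a))
    return [x / s for x in a]

def abs_sin(a, b):
    """|sin| of the angle between two unit vectors."""
    return math.sqrt(max(0.0, 1.0 - dot(a, b) ** 2))

def zf_beamformer(h1_est, h2_est):
    """Unit-norm ZF beamformer for RX 2: h2_est projected orthogonally to h1_est."""
    c = dot(h1_est, h2_est)
    return unit([y - c * x for x, y in zip(h1_est, h2_est)])

def perturb(h, sigma, rng):
    """Hypothetical estimate: h plus an error of norm sigma, renormalized."""
    e = unit([rng.gauss(0, 1) for _ in h])
    return unit([x + sigma * d for x, d in zip(h, e)])

def experiment(sigma, n=4, seed=1):
    rng = random.Random(seed)
    h = [unit([rng.gauss(0, 1) for _ in range(n)]) for _ in range(2)]  # true h1, h2
    # est[j][i]: estimate of channel i+1 available at TX j+1
    est = [[perturb(h[i], sigma, rng) for i in range(2)] for _ in range(2)]
    u1 = zf_beamformer(est[0][0], est[0][1])
    u2 = zf_beamformer(est[1][0], est[1][1])
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(u1, u2)))
    worst = max(abs_sin(est[j][i], h[i]) for j in range(2) for i in range(2))
    gap = abs_sin(est[0][0], est[0][1])  # sine of the angle between the two channels at TX 1
    return diff, worst, gap

for sigma in (1e-1, 1e-2, 1e-3):
    diff, worst, gap = experiment(sigma)
    print(f"sigma={sigma:g}  diff={diff:.3e}  max|sin|={worst:.3e}  ratio={diff / worst:.2f}")
```

The returned `gap` is included because any constant in such a bound necessarily depends on how far the two channels are from being collinear; a loose constant of this kind is enough for the multiplexing gain analysis, where only the scaling in the SNR matters.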

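Lemma 3. Let the beamformers $u_2^{(1)}$ and $u_2^{(2)}$ be computed at TX 1 and TX 2 respectively, then

$$\mathrm{E}\left[\log_2\left(\left\| u_2^{(2)} - u_2^{(1)} \right\|^2\right)\right] \ge \mathrm{E}\left[\log_2\left(C_{LB} \max_{i=1,2,\; j=1,2} \sin^2\!\big(\angle(\hat{h}_i^{(j)}, h_i)\big)\right)\right] \tag{54}$$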



Proof: Wlog we consider that the worst CSI is obtained at TX 1, and we let a genie give
perfect CSI of the full channel to TX 2. Then, we consider two different cases depending on
whether the least accurate estimate is that of $h_1$ or that of $h_2$, and we let a genie then
give perfect CSI of the channel whose estimate is not the least accurate.

Least accurate estimate is an estimate of $h_1$: In that case, the difference reads as




We then rewrite $\Pi^{\perp}_{\hat{h}_1^{(1)}}(h_2)$ as

$$\Pi^{\perp}_{\hat{h}_1^{(1)}}(h_2) = \Pi^{\perp}_{h_1}\big(\Pi^{\perp}_{\hat{h}_1^{(1)}}(h_2)\big) + \big(h_1^{T}\,\Pi^{\perp}_{\hat{h}_1^{(1)}}(h_2)\big)\,h_1, \tag{56}$$

which we can insert in (55) to obtain



u 



(i) 



u 2 



n 



hi 



hi 



|n^(fca) 



(57) 



We lower bound the expression by neglecting the first term to obtain the following expression. 



u 



(i) 



u 2 



> 



l|n^,(/i 2 ) 



n^ (1) (^) + (M 1)T ^)M ij n£ (1) r> 2 



(i) 



1 T 



l|n^(h 2 



(58) 



= isin^fUi))! (^(/(n^^o)^^,^))) . 

The two vectors in the cosine are i.i.d. isotropic in the $(n-1)$-dimensional subspace orthogonal
to $h_1$, so that the cosine is $\beta(1, n-2)$ distributed. It is also independent of the first factor, so that the
expectation of each factor can be taken independently.
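The distributional claim can be sanity-checked by simulation. The following sketch assumes complex isotropic vectors (generated as i.i.d. complex Gaussian vectors) in a space of dimension $n-1$, and reads the statement as applying to the squared magnitude of the cosine, which is the quantity that follows a $\beta(1, n-2)$ law in that setting. The function names and the choice $n = 5$ are illustrative. The check compares the empirical mean with the mean $1/(n-1)$ of a $\beta(1, n-2)$ variable.

```python
import math
import random

def complex_gaussian_vector(dim, rng):
    """Isotropic direction in C^dim, represented by an i.i.d. complex Gaussian vector."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]

def squared_cosine(a, b):
    """|<a, b>|^2 / (||a||^2 ||b||^2) for complex vectors."""
    inner = sum(x.conjugate() * y for x, y in zip(a, b))
    na = sum(abs(x) ** 2 for x in a)
    nb = sum(abs(y) ** 2 for y in b)
    return abs(inner) ** 2 / (na * nb)

def sample_mean_squared_cosine(n, trials=20000, seed=0):
    rng = random.Random(seed)
    dim = n - 1  # dimension of the subspace orthogonal to the reference channel
    total = 0.0
    for _ in range(trials):
        a = complex_gaussian_vector(dim, rng)
        b = complex_gaussian_vector(dim, rng)
        total += squared_cosine(a, b)
    return total / trials

n = 5
estimate = sample_mean_squared_cosine(n)
beta_mean = 1.0 / (n - 1)  # mean of Beta(1, n-2) is 1 / (1 + (n - 2))
print("empirical mean:", estimate, " Beta(1, n-2) mean:", beta_mean)
```

Since the law does not depend on the SNR, the expectation of the logarithm of this factor is a finite constant and does not affect the multiplexing gain.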

Least accurate estimate is an estimate of $h_2$: In that case, the difference reads as



u 



(i) 



u 2 



2 , 



(59) 



We rewrite the term II ^ (flip) as 

n^(^ 1} ) = n£ (n^(h^) + (h^k 

so that the ratio of norms can be written as 



(60) 
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Considering that the term (h^) is small compared to one since the accuracy increases with 



the SNR, we can approximate the numerator using a Taylor approximation. Thus, 



\m (h 



KM) n ^ 2) 



(62) 



Inserting (60) and (62) in (59), we obtain



(i) 



> 



K K(« x) ) 



i - (/4 1)T /* 2 ) + 



i - 



|n^(^)ll 2 



n±(h 2 ) 



sin(Z(/^h2))||sin(Z(hi,»))| 



(i - cosczc^w , hi))) n+ (h 2 ) + cos(z(n^ (n± (^)), n+ (h 2 )))|||n^ (h 



«^ TT-L 



4 1} ) 



sin(Z(^ 1) ,/i 2 )) 



S m(Z(h 1 ,v))\-\^sin(Z(hi 1 \h 2 ))\\n± i (h 2 )\\-^^ 



(63) 



where $v = \Pi^{\perp}_{h_2}(\hat{h}_2^{(1)}) / \|\Pi^{\perp}_{h_2}(\hat{h}_2^{(1)})\|$ is isotropically distributed in the $(n-1)$-dimensional space orthogonal
to $h_2$.

The second multiplying factor is non-zero with probability one, and computing the
expectation of this logarithm gives the result. ■

We will now use the two previous lemmas, which quantify the norm difference between the
beamformers computed at the TXs, to show that the MG given in the theorem is both a lower and an
upper bound for the MG achieved. We start from equation (23) in [3], which is obtained via basic
manipulations using the isotropy of the channel and is written here, adapted to our notations, as



M G f F = 1 - lim E H , W 

P— >oo 



— lim 



log 2 ( 1 + J H/iifl/iftial 2 
log 2 (P) 

log 2 ( |^u 2 | 2 



log 2 (P) 



(64)

where we write for simplicity $u_2$ instead of $u_2^{cZF}$.



Multiplexing Gain Lower Bound: To obtain a lower bound for the MG, we need to derive
an upper bound for $|h_1^{H} u_2|$. We define the selection matrix $E_2$ (a diagonal matrix that
retains only the entries of the beamformer applied at TX 2) and rewrite the
interference term as:



i,H„, _ iH„(l) , tt. /. ,(2) (1) 



< 
< 
< 



h*v!p 



(65) 



< 



E 2 (u 2 



(2) 



U 



(1) 



Applying Lemma 2, we obtain the bound



u 



(2) 



U 



(1) 



(2) 

u\ — U 



(1) 



<C { ^- {1) max(|sin(Z(^A))| 



while we can also apply the lemma for 
CSI, to write 



with ||772 II < Cub max i=i,2 ( | sin(Z(/i J liJ , hi))\ ) and Cyg is a positive constant. Thus, 



u 



U 



(1) 



(1) 



i=l,2 



U) 



(66) 



u 2 + ry 2 



with u 2 the ZF beamformer with perfect 

(67) 



h^u 2 



< 



< 



it 



(2) 



(2) 



U 



(1) 



7/ 



(1) 



(68) 



< Cub max ( I sin( Zfh'f', hi 

i=l,2y=l,2 v 



Note that the vector $u_2$ is not exactly unit-norm due to the lack of coordination between the TXs. Indeed, TX 1 normalizes its
coefficient by $\|u_2^{(1)}\|$, which is a priori not equal to $\|u_2^{(2)}\|$. However, we consider only a number of feedback bits scaling with
$\log_2(P)$, since otherwise the MG in a single-TX configuration is zero, and thus also in a distributed CSI configuration. Hence,
the accuracy of the normalization improves with $\log_2(P)$, and the power constraint is fulfilled with an accuracy increasing in
the SNR, which is sufficiently good for practical systems.
with $C_{UB} = C_{UB}^{(2)-(1)} + C_{UB}^{(1)}$. Then, inserting (68) in (64) gives



M G f F = - lim E &liW 



log 2 (Jh^u^ 2 



> — lim E 



P— >oo 



hi,W 



log 2 



hfu? 



(2) (1) 
«2 - U 2 



2\ 1 



> — lim Er 



P— >oo 



log 2 (P) 



log 2 ( (C UB + 4b) max i=li2y=1 , 2 ( sm 2 (Z(^, ^)) 



(69) 



Aj) 



log 2 (P) 

From (69), we can then use Jensen's inequality and Proposition 3 in Appendix VI-A to obtain
the lower bound from the theorem.



M, 



cZF 
Gi 



> — lim 

P— >oo 



log 2 ( Ct'trE 



max i= i i2;i= i i2 ( sin 2 (Z(/i? ) , hi)) 



> — lim 

P— >oo 



loo- fT' r ^) r -l/(n-l) 
1Q g2 l°UB n -l C « 



log 2 (P) 

2 -min i , j (B^ ) )/(n-l)( 1 + (1))) 



log 2 (P) 



(70) 



$$= \lim_{P\to\infty} \frac{\min_{i,j} B_i^{(j)}}{(n-1)\log_2(P)} = \min_{i,j} \alpha_i^{(j)}.$$
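The last step of the lower bound can be made concrete with a small numerical sketch. It rests on two assumptions that are labeled here rather than taken verbatim from the text: the number of bits scales as $B = \alpha\,(n-1)\log_2(P)$, and the quantization error behaves like $c \cdot 2^{-B/(n-1)}$ for some constant $c$ (the value `c=3.0` below is purely hypothetical). Under these assumptions the interference exponent is $\alpha - \log_2(c)/\log_2(P)$, so the constant vanishes in the limit and the smallest $\alpha$ among the four estimates determines the result.

```python
import math

def interference_exponent(alpha, n, log2_p, c=1.0):
    """-log2(c * 2^(-B/(n-1))) / log2(P), with B = alpha * (n-1) * log2(P)."""
    bits = alpha * (n - 1) * log2_p
    log2_err = math.log2(c) - bits / (n - 1)
    return -log2_err / log2_p

def mg_estimate(alphas, n, log2_p, c=1.0):
    """Exponent set by the least accurate estimate (smallest alpha)."""
    return min(interference_exponent(a, n, log2_p, c) for a in alphas)

# Hypothetical scaling exponents for the four estimates (two channels, two TXs).
alphas = [1.0, 0.8, 0.5, 0.9]
for log2_p in (10, 100, 1000):
    print(log2_p, round(mg_estimate(alphas, n=4, log2_p=log2_p, c=3.0), 4))
```

The printed values approach the smallest exponent from below as $\log_2(P)$ grows, which mirrors why constants such as $C_{UB}$ play no role in the multiplexing gain.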



Multiplexing Gain Upper Bound: We now derive an upper bound for the MG, which means
a lower bound for the interference. We consider wlog that TX 1 has the worst channel estimate
and let a genie give perfect CSI to TX 2. We can now use Lemma 3 to derive a lower bound
for the interference term.



E 

=E 

=E 
>E 



log 2 ( \h^u 2 l2 



log 2 
log 2 
log 2 



h% (ul + E^u^ -u* 2 



cos(Z(/ifEi, — W2))||/ifEi|| ||u 2 1J — u* 2 



(i) 



cos(Z(/ifE 1 ,^ 1) -w*))||/ifE 1 



log 2 ( C'S i= max ii2 (sin 2 (Z(^ ) , %,)) 



(7 



1) 
Inserting (71) into the MG expression (64) yields




E huW [log 2 (tf<2 matM^y (sinV^FU*)))) + 0{1) 




j. miDj = i i2 j = i,2 + log 2 (cn) + log 2 ( e ) 
p^o (n-l)log 2 (P) 



(72) 



mm <x>' 

i=l,2,j=l,2 



(i) 



and we have used Proposition @] in Appendix IVI-AI to bound the expectation of the logarithm of 